ABSTRACT

This survey covers the theory and application of Lagrangean techniques
to discrete optimization problems. A discussion of the applications
includes integer programming special structures which can be exploited by
Lagrangean techniques, multi-item production scheduling and inventory
control problems, and the traveling salesman problem. The relationship of
Lagrangean techniques to duality theory and convex analysis is given,
including a discussion of algorithms to solve the dual problems. Duality
theory for integer programming and its relationship to the cutting plane
method is reviewed. The use of Lagrangean techniques in conjunction with
branch and bound is presented in a general framework for solving discrete
optimization problems.


1.  Introduction 


Lagrangean techniques were proposed for discrete optimization problems
as far back as 1955 when Lorie and Savage suggested a simple method for trying
to solve zero-one integer programming (IP) problems. We use their method as
a starting point for discussing many of the developments since then. The
behavior of Lagrangean techniques in analyzing and solving zero-one IP problems
is typical of their use on other discrete optimization problems discussed in
later sections.

Specifically,  consider  the  zero-one  IP  problem 


$$
\begin{aligned}
v = \min\ & cx \\
\text{s.t.}\ & Ax \le b \\
& x_j = 0 \text{ or } 1.
\end{aligned}
\tag{1}
$$


Let $c_j$ denote a component of $c$, $a_j$ a column of $A$ with components
$a_{ij}$, and $b_i$ a component of $b$. Letting $u$ represent a non-negative
vector of Lagrange multipliers on the right hand side $b$, the method proceeds
by computing the Lagrangean

$$L^0(u) = -ub + \min_{x_j = 0 \text{ or } 1} \{(c + uA)x\}. \tag{2}$$


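The function $L^0(u)$ is clearly optimized by any zero-one solution $x$ satisfying

$$
x_j =
\begin{cases}
0 & \text{if } c_j + u a_j > 0 \\
0 \text{ or } 1 & \text{if } c_j + u a_j = 0 \\
1 & \text{if } c_j + u a_j < 0.
\end{cases}
\tag{3}
$$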
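As a minimal sketch of the calculation in (2) and (3), the following Python function evaluates the Lagrangean for a given multiplier vector. The function name, the tie-breaking choice (x_j = 0 when the reduced cost is zero) and the small data set are assumptions made for illustration; they do not come from the survey.

```python
def lagrangean_zero_one(c, A, b, u):
    """Evaluate L0(u) = -ub + min over zero-one x of (c + uA)x using rule (3)."""
    m, n = len(A), len(c)
    # Reduced cost of column j: c_j + u a_j
    reduced = [c[j] + sum(u[i] * A[i][j] for i in range(m)) for j in range(n)]
    # Rule (3): x_j = 1 only when the reduced cost is negative.
    # Ties (reduced cost exactly zero) are broken toward 0 here by assumption.
    x = [1 if r < 0 else 0 for r in reduced]
    value = -sum(u[i] * b[i] for i in range(m)) + sum(r for r in reduced if r < 0)
    return value, x


# Hypothetical data: two constraints, three zero-one variables.
c = [-5, -4, -3]
A = [[2, 3, 1], [4, 1, 2]]
b = [5, 11]
value, x = lagrangean_zero_one(c, A, b, u=[1.0, 0.5])
```

For these multipliers every reduced cost is negative, so the minimizing x sets all variables to 1. That x violates the first constraint, which illustrates that a Lagrangean minimizer need not be feasible in (1).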


In the introduction, we pose and discuss a number of questions about this
method and its relevance to optimizing the original IP problem (1). In
several instances, we will state results without proof. These results will
either be proven in later sections, or reference will be given to papers
containing the relevant proofs.

When  is  a zero-one  solution  x which  is  optimal 
in  the  Lagrangean  also  optimal  in  the  given  IP 
problem? 

In order to answer this question, we must recognize that the underlying
goal of Lagrangean techniques is to try to establish the following sufficient
optimality conditions.

OPTIMALITY CONDITIONS: The pair $(x,u)$, where $x$ is zero-one and $u \ge 0$,
is said to satisfy the optimality conditions for the zero-one IP problem (1) if

(i) $L^0(u) = -ub + (c + uA)x$

(ii) $u(Ax - b) = 0$

(iii) $Ax \le b$.

It can be shown that if the zero-one solution $x$ satisfies the optimality
conditions for some $u$, then $x$ is optimal in problem (1). This result is
demonstrated in greater generality in section 3. The implication for the
Lagrangean analysis is that $x$ computed by (3) is optimal in problem (1) if
it satisfies $Ax \le b$ with equality on rows where $u_i > 0$.
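The three optimality conditions can be checked mechanically for a candidate pair (x, u). The sketch below is illustrative only: the function name, the numerical tolerance, and the one-constraint example are assumptions, not part of the original text.

```python
def satisfies_optimality_conditions(c, A, b, u, x, tol=1e-9):
    """Check conditions (i)-(iii) for a zero-one x and multipliers u >= 0."""
    m, n = len(A), len(c)
    reduced = [c[j] + sum(u[i] * A[i][j] for i in range(m)) for j in range(n)]
    # (i) x minimizes the Lagrangean: x_j = 1 requires a non-positive reduced
    #     cost and x_j = 0 requires a non-negative one, as in rule (3).
    minimizes = all((r <= tol) if xj == 1 else (r >= -tol)
                    for r, xj in zip(reduced, x))
    slack = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    # (iii) primal feasibility: Ax <= b
    feasible = all(s >= -tol for s in slack)
    # (ii) complementary slackness, checked row by row (valid since u >= 0
    #      and, when (iii) holds, every slack is non-negative)
    complementary = all(abs(u[i] * slack[i]) <= tol for i in range(m))
    return minimizes and complementary and feasible


# Hypothetical one-constraint instance: min -5x1 - 4x2 - 3x3, 2x1 + 3x2 + x3 <= 3.
c = [-5, -4, -3]
A = [[2, 3, 1]]
b = [3]
ok = satisfies_optimality_conditions(c, A, b, u=[2.0], x=[1, 0, 1])
bad = satisfies_optimality_conditions(c, A, b, u=[0.0], x=[1, 1, 1])
```

In this instance x = (1, 0, 1) with u = 2 meets all three conditions, so it is optimal. With u = 0 the Lagrangean minimizer x = (1, 1, 1) fails condition (iii).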

Of course, we should not expect that $x$ computed by (3) will even be
feasible in (1), much less optimal. According to the optimality conditions,
however, such an $x$ is optimal in any zero-one IP problem derived from (1)
by replacing $b$ with $Ax + \delta$ where $\delta$ is any non-negative vector
satisfying $\delta_i = 0$ for $i$ such that $u_i > 0$. This property of $x$ makes
Lagrangean techniques useful in computing zero-one solutions to IP problems
with soft constraints or in parametric analysis of an IP problem over a family
of right hand sides. Parametric analysis of discrete optimization problems is
discussed again in section 6.


How  should  the  vector  u of  Lagrange  multipliers  be 
selected?  Can  we  guarantee  that  there  will  be  one 
which  produces  an  optimal  solution  to  the  original 
IP  problem? 


An arbitrary $u$ fails to produce an optimal $x$ because $\sum_j a_{ij} x_j > b_i$
on some rows or $\sum_j a_{ij} x_j < b_i$ on other rows with $u_i > 0$. In order
to change $u$ so that the resulting $x$ is closer to being feasible and/or
optimal, we could consider increasing $u_i$ on the former rows and decreasing
$u_i$ on the latter rows. A convergent tatonnement approach of this type is
non-trivial to construct because we must simultaneously deal with desired
changes on a number of rows. Systematic adjustment of $u$ can be achieved,
however, by recognizing that there is a dual problem and a duality theory
underlying the Lagrangean techniques. We discuss this point here briefly and
in more detail in section 3.

For any $u \ge 0$, it can easily be shown that $L^0(u)$ is a lower bound on $v$,
the minimal IP objective function cost in (1). The best choice of $u$ is any
one which yields the greatest lower bound, or equivalently, any $u$ which is
optimal in the dual problem

$$
\begin{aligned}
d^0 = \max\ & L^0(u) \\
\text{s.t.}\ & u \ge 0.
\end{aligned}
\tag{4}
$$


The reason for this choice is that if $u$ can yield by (3) an optimal $x$ to
the primal problem (1), then $u$ is optimal in (4). The validity of this
statement can be verified by direct appeal to the optimality conditions using
the weak duality condition $L^0(u) \le v$ for any $u \ge 0$. Thus, a strategy for
trying to solve the primal problem (1) is to compute an optimal solution
$u$ to the dual problem (4), and then try to find a complementary zero-one
solution $x$ for which the optimality conditions hold.

A fundamental question about Lagrangean techniques is whether or not
an optimal dual solution to (4) can be guaranteed to produce an optimal
solution to the primal IP problem (1). It turns out that the answer is no,
although fail-safe methods exist and will be discussed for using the dual to
solve the primal. If (4) cannot produce an optimal solution to (1), we say
there is a duality gap.
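A duality gap can be seen on a very small instance. The sketch below, using made-up data, computes v by enumerating all zero-one solutions and evaluates the Lagrangean on a coarse grid of multipliers using exact rational arithmetic. The grid is an assumption chosen so that it happens to contain a maximizing multiplier for this particular instance; it is not a general method for solving the dual.

```python
from fractions import Fraction
from itertools import product

# Hypothetical instance of problem (1).
c = [-5, -4, -3]
A = [[2, 3, 1], [4, 1, 2]]
b = [5, 11]


def L0(u):
    """Lagrangean (2): -ub plus the sum of the negative reduced costs."""
    reduced = [c[j] + sum(u[i] * A[i][j] for i in range(2)) for j in range(3)]
    return -sum(u[i] * b[i] for i in range(2)) + sum(min(0, r) for r in reduced)


# Primal optimum v by brute-force enumeration of the zero-one solutions.
feasible = [x for x in product((0, 1), repeat=3)
            if all(sum(A[i][j] * x[j] for j in range(3)) <= b[i] for i in range(2))]
v = min(sum(c[j] * x[j] for j in range(3)) for x in feasible)

# Best lower bound found on a grid of multipliers with step 1/3.
grid = [Fraction(k, 3) for k in range(10)]
d0 = max(L0((u1, u2)) for u1 in grid for u2 in grid)
```

Here v = -9 while the best bound found is -32/3, attained at u = (4/3, 0). Every grid value of the Lagrangean lies strictly below v, so no multiplier on the grid can satisfy the optimality conditions for this instance.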

Insight into why a duality gap occurs is gained by observing that
problem (4) is equivalent to the LP dual of the LP relaxation of (1) which
results by replacing $x_j = 0$ or $1$ by the constraints $0 \le x_j \le 1$. This was
first pointed out by Nemhauser and Ullman (1968). Here we use the term
relaxation in the formal sense; that is, a mathematical programming problem
is a relaxation of another given problem if its set of feasible solutions
contains the set of feasible solutions to the given problem. The fact that
dualization of (1) is equivalent to convexification of it is no accident
because the equivalence of these two operations is valid for arbitrary
mathematical programming problems (see Magnanti, Shapiro and Wagner (1976)).

For discrete optimization problems, the convexified relaxations are LP
problems. Geoffrion (1974) has used the expression Lagrangean relaxation to
describe this equivalence. Insights and solution methods for the primal problem
are derived from both the dualization and convexification viewpoints.

How  should  the  dual  problem  be  solved? 


We remarked above that problem (4) is nothing more than the dual to
the ordinary LP relaxation of (1). Thus, a vector of optimal dual variables
could be calculated by applying the simplex algorithm to the LP relaxation
of (1). The use of Lagrangean techniques as a distinct approach to discrete
optimization has nevertheless proven theoretically and computationally
important for three reasons. First, dual problems derived from more complex
discrete optimization problems than (1) can be represented as LP problems, but
ones of immense size which cannot be explicitly constructed and then solved by
the simplex method. Included in this category are dual problems more complex
than (4) derived from (1) when (4) fails to solve (1). These are discussed in
sections 2 and 4. From this point of view, the Lagrangean techniques applied to
discrete optimization problems are a special case of dual decomposition methods
for large scale LP problems (e.g., see Lasdon (1970)).

A second reason for considering Lagrangean techniques for solving dual
problems, in addition to the simplex method, is that the simplex method is
exact whereas the dual problems are only relaxation approximations. It is
sometimes more effective to use an approximate method to compute quickly a
good, but non-optimal, solution to a dual problem. In section 3, we consider
alternative methods to the simplex method for solving dual problems and
discuss their relation to the simplex method. The underlying idea is to treat
dual problems as nondifferentiable steepest ascent problems taking into
account the fact that the Lagrangean $L^0$ is concave.

Third, viewing Lagrangean techniques as a distinct approach to discrete
optimization problems emphasizes the need they satisfy to exploit special
structures which arise in various models. This point is discussed in more
detail in section 2.

What should be done if there is a duality gap?


As we shall see in section 3, a duality gap manifests itself by the
computation of a fractional solution to the LP relaxation of problem (1). When
this occurs, there are two complementary approaches which permit the analysis
of problem (1) to continue and an optimal solution to be calculated. One
approach is to branch on a variable at a fractional level in the LP relaxation;
namely, use branch and bound. The integration of Lagrangean techniques with
branch and bound is given in section 5. The other approach is to strengthen the
dual problem (4) by restricting the solutions permitted in the Lagrangean
minimization to be a strict subset of the zero-one solutions. This is
accomplished in a systematic fashion by the use of group theory and is
discussed in section 4.

2.  Exploiting  Special  Structures 

Lagrangean techniques can be used to exploit special structures arising
in IP and discrete optimization problems to construct efficient computational
schemes. Moreover, identification and exploitation of special structures
often provide insights into how discrete optimization models can be extended
in new and richer applications.

The class of problems we consider first is

$$
\begin{aligned}
v = \min\ & cx \\
\text{s.t.}\ & Ax \le b \\
& x \in X \subseteq R^n,
\end{aligned}
\tag{5}
$$


where $X$ is a discrete set with special structure. For example, $X$ may consist
of the multiple-choice constraints

$$
\begin{aligned}
& \sum_{j \in J_k} x_j = 1 \quad \text{for all } k \\
& x_j = 0 \text{ or } 1,
\end{aligned}
\tag{6}
$$


where the sets $J_k$ are disjoint. Another example occurs when $X$ corresponds to
a network optimization problem. In this case, the representation of $X$ can
either be as a totally unimodular system of linear inequalities, or as
a network. Other IP examples are discussed by Geoffrion (1974).

The Lagrangean derived from (5) for any $u \ge 0$ is

$$L(u) = -ub + \min_{x \in X} (c + uA)x. \tag{7}$$


We  expect  L(u)  to  be  much  easier  to  compute  than  v because  of  the  special 
form  of  X.  Depending  on  the  structure  of  X,  the  specific  algorithm  used 
to  compute  L may  be  a "good"  algorithm  in  the  terminology  of  Edmonds  (1971) 
or  Karp  (1975);  that  is,  the  number  of  elementary  operations  required  to 
compute  L(u)  is  bounded  by  a polynomial  of  parameters  of  the  problem.  Even 
if  it  is  not  "good"  in  a strictly  theoretical  sense,  the  algorithm  may  be 
quite  efficient  empirically  and  derived  from  some  simple  dynamic  programming 
recursion  or  list  processing  scheme.  Examples  will  be  given  later  in  this 
section.  Finally,  in  most  instances  the  x calculations  in  (7)  will  be  integer 
and  may  provide  a useful  starting  point  for  heuristic  methods  to  compute  good 
solutions  to  (5). 

Most discrete optimization problems stated in general terms can be
formulated as IP problems, although sometimes with difficulty and
inefficiently. We illustrate with two examples how Lagrangean techniques are
useful in handling special structures which are poorly represented by systems
of linear inequalities.

We consider a manufacturing system consisting of $I$ items for which
production is to be scheduled at minimum cost over $T$ time periods. The demand
for item $i$ in period $t$ is the nonnegative integer $r_{it}$; this demand must be
met by stock from inventory or by production during the period. Let the
variable $x_{it}$ denote the production of item $i$ in period $t$. The inventory of
item $i$ at the end of period $t$ is

$$y_{it} = y_{i,t-1} + x_{it} - r_{it}, \qquad t = 1,\ldots,T,$$

where we assume $y_{i0} = 0$, or equivalently, initial inventory has been
netted out of the $r_{it}$. Associated with $x_{it}$ is a direct unit cost of
production $c_{it}$. Similarly, associated with $y_{it}$ is a direct unit cost of
holding inventory $h_{it}$. The problem is complicated by the fact that positive
production of item $i$ in period $t$ uses up a quantity $a_i + b_i x_{it}$ of a scarce
resource $q_t$ to be shared among the $I$ items. The parameters $a_i$ and $b_i$ are
assumed to be nonnegative. The use of Lagrangean techniques on this type of
problem was originally proposed by Manne (1958). The model and analysis were
extended by Dzielinski and Gomory (1965) and have been applied by Lasdon and
Terjung (1971).


This problem can be written as the mixed integer programming problem

$$v = \min \sum_{i=1}^{I} \sum_{t=1}^{T} (c_{it} x_{it} + h_{it} y_{it}) \tag{8a}$$

$$\text{s.t.} \quad \sum_{i=1}^{I} (a_i \delta_{it} + b_i x_{it}) \le q_t, \qquad t = 1,\ldots,T \tag{8b}$$

for $i = 1,\ldots,I$:

$$\sum_{s=1}^{t} x_{is} - y_{it} = \sum_{s=1}^{t} r_{is}, \qquad t = 1,\ldots,T \tag{8c}$$

$$x_{it} \le M_{it}\,\delta_{it}, \qquad t = 1,\ldots,T \tag{8d}$$

$$x_{it} \ge 0, \quad y_{it} \ge 0, \quad \delta_{it} = 0 \text{ or } 1, \qquad t = 1,\ldots,T \tag{8e}$$

where $M_{it} = \sum_{s=t}^{T} r_{is}$ is an upper bound on the amount we would want
to produce of item $i$ in period $t$. The constraints (8b) state that shared
resource usage cannot exceed $q_t$. The constraints (8c) relate accumulated
production and demand through period $t$ to ending inventory in period $t$, and
the nonnegativity of the $y_{it}$ implies demand must be met and not delayed
(backlogged). The constraints (8d) ensure that $\delta_{it} = 1$, and therefore the
fixed charge resource usage $a_i$ is incurred, if production $x_{it}$ is positive in
period $t$. Problem (8) is a mixed integer programming problem with $IT$ zero-one
variables, $2IT$ continuous variables and $T + 2IT$ constraints. For the
application of Lasdon and Terjung (1971), these figures are 240 zero-one
variables, 480 continuous variables, and 486 constraints, which is a mixed
integer programming problem of significant size.

For future reference, define the set

$$N_i = \{(\delta_{it}, x_{it}, y_{it}),\ t = 1,\ldots,T \mid \delta_{it}, x_{it}, y_{it} \text{ satisfy (8c), (8d), (8e)}\}. \tag{9}$$


This set describes a feasible production schedule for item $i$ ignoring the
joint constraints (8b). The integer programming formulation (8) is not
effective because it fails to exploit the special network structure of the
sets $N_i$. This can be accomplished by Lagrangean techniques as follows.
Assign Lagrange multipliers $u_t \ge 0$ to the scarce resources $q_t$ and place
the constraints (8b) in the objective function to form the Lagrangean


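$$L(u) = -\sum_{t=1}^{T} u_t q_t + \min_{(\delta_{it}, x_{it}, y_{it}) \in N_i,\ i = 1,\ldots,I}\ \sum_{i=1}^{I} \sum_{t=1}^{T} \{(c_{it} + u_t b_i) x_{it} + u_t a_i \delta_{it} + h_{it} y_{it}\}.$$

Letting

$$L_i(u) = \min_{(\delta_{it}, x_{it}, y_{it}) \in N_i}\ \sum_{t=1}^{T} \{(c_{it} + u_t b_i) x_{it} + u_t a_i \delta_{it} + h_{it} y_{it}\}, \tag{10}$$

the Lagrangean function clearly separates to become

$$L(u) = -\sum_{t=1}^{T} u_t q_t + \sum_{i=1}^{I} L_i(u).$$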

Each of the problems (10) is a simple dynamic programming shortest-route
calculation for scheduling item $i$ where the Lagrange multipliers on shared
resources adjust the costs as shown. Notice that it is easy to add any
additional constraint on the problem of scheduling item $i$ which can be
accommodated by the network representation; for example, permitting
production in period $t$ only if inventory falls below a preassigned level.
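A minimal sketch of one single-item calculation (10) follows. It assumes the standard zero-inventory property for uncapacitated lot sizing with fixed charges (production occurs only when inventory is zero, so each production run covers the demand of a consecutive block of periods). The survey refers to a shortest-route recursion but does not spell this property out, so treat it, the function name, and the data as assumptions.

```python
def item_subproblem(r, c, h, a, b, u):
    """Shortest-route DP for L_i(u): arcs (t, k) mean 'produce in period t
    exactly the demand of periods t..k', priced with multiplier-adjusted costs."""
    T = len(r)
    # best[k] = minimum adjusted cost of meeting demand in periods 1..k
    # with zero inventory at the end of period k
    best = [0.0] + [float("inf")] * T
    for k in range(1, T + 1):
        for t in range(1, k + 1):
            qty = sum(r[t - 1:k])                       # demand of periods t..k
            setup = u[t - 1] * a if qty > 0 else 0.0    # u_t * a_i * delta_it
            variable = (c[t - 1] + u[t - 1] * b) * qty  # (c_it + u_t b_i) x_it
            # inventory at the end of period s (t <= s < k) is the demand
            # of periods s+1..k
            holding = sum(h[s - 1] * sum(r[s:k]) for s in range(t, k))
            best[k] = min(best[k], best[t - 1] + setup + variable + holding)
    return best[T]


# Hypothetical three-period item: demands, unit costs, holding costs,
# resource parameters a_i = 2, b_i = 0, and multipliers u_t.
cost = item_subproblem(r=[2, 3, 1], c=[1, 1, 1], h=[1, 1, 1],
                       a=2, b=0, u=[1.0, 5.0, 0.0])
```

With a large multiplier on period 2, the recursion avoids a setup there: it produces five units in period 1 and one unit in period 3, for an adjusted cost of 11.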

Unfortunately, we must give up something in using Lagrangean techniques
on the mixed IP (8) to exploit the special structure of the sets $N_i$. In
the context of this application, the optimality conditions we seek but may
not achieve involve Lagrange multipliers which permit each of the $I$ items to
be separately scheduled by the dynamic programming calculation while
achieving a global minimum. As we see in the next section, this can be at
least approximately accomplished if the number of joint constraints is small
relative to $I$.


In summary, the application of Lagrangean techniques just discussed
involves the synthesis of a number of simple dynamic programming models under
joint constraints into a more complex model. In a similar fashion, Fisher (1973)
applied Lagrangean techniques to problems where a number of jobs are to be
scheduled, each according to a precedence or CPM network, and the joint
constraints are machine capacity. Another example is the cutting stock problem
of Gilmore and Gomory (1963). In this model, a knapsack problem is used to
generate cutting patterns and the joint constraints are on demand to be
satisfied by some combination of the patterns generated.

The traveling salesman problem is a less obvious case where an underlying
graph structure can be exploited to provide effective computational procedures.
The problem is defined over a complete graph $G$ with $n$ nodes and symmetric
lengths $c_{ij} = c_{ji}$ for all edges $\langle i,j \rangle$. The objective is to find a
minimum length tour of the $n$ nodes, or in other words, a simple cycle of $n$
edges and minimal length. This problem has several IP formulations involving
variables $x_{ij}$ for the $n(n-1)/2$ edges $\langle i,j \rangle$ in the complete graph.

One such IP formulation consists of approximately $2^n$ constraints
ensuring for feasible subgraphs of $n$ edges that (i) the degree at each node is
2 and (ii) no cycle is formed among a subset of the nodes excluding node 1. The
set of subgraphs of $n$ edges satisfying (ii) has a very efficient
characterization. A 1-tree defined on the graph $G$ is a subgraph which is a tree
on the nodes $2,\ldots,n$ and which is connected to node 1 by exactly two edges.
The collection of subgraphs of $n$ edges satisfying (ii) is precisely the set
$\mathcal{T}$ of 1-trees.

Thus, the traveling salesman problem can be written as the IP problem

$$
\begin{aligned}
\min\ & \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} c_{ij} x_{ij} \\
\text{s.t.}\ & \sum_{k<i} x_{ki} + \sum_{j>i} x_{ij} = 2, \qquad i = 1,\ldots,n \\
& x \in \mathcal{T} \subseteq R^{n(n-1)/2}.
\end{aligned}
\tag{11}
$$

The implication of the formulation (11), however, is that we wish to deal with
the 1-tree constraints implicitly rather than as a system of linear
inequalities involving the zero-one variables $x_{ij}$.

Held and Karp (1970) discovered this partitioning of the traveling
salesman problem and suggested the use of Lagrange multipliers on the degree
constraints. For $u \in R^n$, the Lagrangean is

$$L(u) = -2 \sum_{i=1}^{n} u_i + \min_{x \in \mathcal{T}}\ \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} (c_{ij} + u_i + u_j) x_{ij}. \tag{12}$$


This calculation is particularly easy to perform because it is essentially
the problem of finding a minimum spanning tree in a graph. A "greedy" or
"myopic" algorithm is available for this problem which is "good" in the
theoretical sense and very efficient empirically (Kruskal (1956) and
Edmonds (1971)).
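The Lagrangean (12) can be sketched in a few lines using Kruskal's greedy rule. In the sketch below, node 0 plays the role of node 1 in the text; the function name, the union-find helper and the four-node cost matrix are illustrative assumptions.

```python
def one_tree_bound(cost, u):
    """L(u) of (12): -2*sum(u) plus the weight of a minimum 1-tree under
    the adjusted edge lengths c_ij + u_i + u_j."""
    n = len(cost)

    def w(i, j):
        return cost[i][j] + u[i] + u[j]

    # Union-find structure for Kruskal's algorithm on nodes 1..n-1
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted((w(i, j), i, j) for i in range(1, n) for j in range(i + 1, n))
    total = 0.0
    for weight, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:            # greedy step: keep the edge if it joins two components
            parent[ri] = rj
            total += weight
    # Connect the special node by its two cheapest adjusted edges
    total += sum(sorted(w(0, j) for j in range(1, n))[:2])
    return -2 * sum(u) + total


# Hypothetical symmetric 4-node instance; every tour has length 14.
cost = [[0, 1, 2, 3],
        [1, 0, 4, 5],
        [2, 4, 0, 6],
        [3, 5, 6, 0]]
bound_at_zero = one_tree_bound(cost, [0, 0, 0, 0])
bound_adjusted = one_tree_bound(cost, [0, 1, 0, -1])
```

At u = 0 the minimum 1-tree gives the lower bound 12. Raising the multiplier on the node of degree 3 and lowering it on the node of degree 1 lifts the bound to 14, the optimal tour length for this small example.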

The traveling salesman problem is only a substructure arising in
applications of discrete optimization including vehicle routing and chemical
reactor sequencing. Lagrangean techniques can be used to synthesize the
routing or scheduling problems into a more complex model. In a similar
application, Gomory and Hu (1964) discovered the importance of the spanning
tree as a substructure arising in the synthesis of communications networks.
Specifically, a maximum spanning tree problem is solved to determine the
capacities in communications links required for the attainment of desired
levels of flows. Lagrangean techniques can be used to iteratively select
the spanning tree on which to perform the analysis until the communications
problem is solved at minimum cost.

All of the special structures discussed above arise naturally in
applications. By contrast, a recent approach to IP involves the construction
of a special structure which we use as a surrogate for the constraints
$Ax \le b$. The approach requires that $A$ and $b$ in problem (1) have integer
coefficients; henceforth we assume this to be the case. For expositional
convenience, we rewrite the inequalities as equalities $Ax + Is = b$ where
now we require the slack variables to be integer because $A$ and $b$ are integer.

The system $Ax + Is = b$ is aggregated to form a system of linear
congruences which we view as an equation defined over a finite abelian group.

The idea of using group theory to analyze IP problems was first suggested
by Gomory (1965), although his specific approach was very different. We
follow here the approach of Bell and Shapiro (1976). Specifically, consider


the abelian group

$$G = Z_{q_1} \oplus Z_{q_2} \oplus \cdots \oplus Z_{q_r},$$

where the $q_i$ are integers greater than 1, $Z_{q_i}$ is the cyclic group of
order $q_i$, and $\oplus$ denotes direct sum. Let $Z^m$ denote the set of all integer
m-vectors, and construct a homomorphism $\phi$ from $Z^m$ into $G$ as follows. For
each row $i$, we associate an element $\varepsilon_i$ of $G$, and for any $z \in Z^m$,
$\phi(z) = \sum_{i=1}^{m} z_i \varepsilon_i$. We apply $\phi$ to both sides of the
linear system $Ax + Is = b$ to aggregate it into the group equation

$$\sum_{j=1}^{n} \alpha_j x_j + \sum_{i=1}^{m} \varepsilon_i s_i = \beta,$$

where $\alpha_j = \phi(a_j)$ and $\beta = \phi(b)$. It is easy to see
that any integer $x, s$ satisfying $Ax + Is = b$ also satisfies the group equation.
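The aggregation can be illustrated on a small example. The sketch below represents G as a direct sum of cyclic groups with elements stored as tuples of residues. The choice G = Z_3, the elements assigned to the rows, and the data A and b are hypothetical values picked only to make the arithmetic easy to follow.

```python
def phi(z, eps, q):
    """Homomorphism from integer m-vectors into G = Z_q[0] (+) ... (+) Z_q[r-1]:
    phi(z) = sum_i z_i * eps[i], computed componentwise modulo q[k]."""
    return tuple(sum(z[i] * eps[i][k] for i in range(len(z))) % q[k]
                 for k in range(len(q)))


# Hypothetical data: two rows, three columns, G = Z_3.
A = [[2, 3, 1], [4, 1, 2]]
b = [5, 11]
q = [3]
eps = [(1,), (2,)]          # group element assigned to each row

alpha = [phi(col, eps, q) for col in zip(*A)]   # alpha_j = phi(a_j)
beta = phi(b, eps, q)                           # beta = phi(b)


def group_equation_holds(x, s):
    """Check sum_j alpha_j x_j + sum_i eps_i s_i = beta in G."""
    lhs = tuple(
        (sum(alpha[j][k] * x[j] for j in range(len(x)))
         + sum(eps[i][k] * s[i] for i in range(len(s)))) % q[k]
        for k in range(len(q)))
    return lhs == beta


# An integer solution of Ax + Is = b: x = (1, 1, 0) with slacks s = b - Ax.
x = (1, 1, 0)
s = tuple(b[i] - sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(b)))
```

Because phi is a homomorphism, the pair (x, s) above automatically satisfies the group equation, while a pair that violates Ax + Is = b, such as x = (1, 0, 0) with s = (0, 0), may fail it.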

Therefore, we can add the group equation to the zero-one IP problem (1)
without eliminating any feasible solutions. This gives us

$$\min\ cx \tag{13a}$$

$$\text{s.t.} \quad Ax + Is = b \tag{13b}$$

$$\sum_{j=1}^{n} \alpha_j x_j + \sum_{i=1}^{m} \varepsilon_i s_i = \beta \tag{13c}$$

$$x_j = 0 \text{ or } 1, \quad s_i = 0, 1, 2, \ldots \tag{13d}$$

For future reference, let

$$Y = \{(x,s) \mid (x,s) \text{ satisfies (13c) and (13d)}\}.$$

Lagrangean techniques are applied by dualizing with respect to the
original constraints $Ax + Is = b$. For $u \ge 0$, this gives us the Lagrangean

$$L(u) = -ub + \min_{(x,s) \in Y} \{(c + uA)x + us\}. \tag{14}$$

The calculation (14) can be carried out quite efficiently by list
processing algorithms with the computation time determined mainly by the
order of the group. (See Shapiro (1968), Glover (1969), Gorry, Northup
and Shapiro (1973).) It is easy to see that for a non-trivial group $G$
(i.e., $|G| \ge 2$), the Lagrangean $L$ gives higher lower bounds than $L^0$ from
section 1 since not all zero-one solutions $x$ are included in $Y$ for some value
of $s$. The selection of $G$ and homomorphism $\phi$ is discussed again in section 4.

We have not attempted to be exhaustive in our discussion of the various
discrete optimization models for which Lagrangean techniques have been
successfully applied. Lagrangean techniques have also been applied to scheduling
nuclear reactors (Muckstadt (1977)), the generalized assignment problem (Ross
and Soland (1975)) and multi-commodity flow problems (Held, Wolfe and
Crowder (1974)).

3.  Duality  Theory  and  the  Calculation  of  Lagrange  Multipliers 

Implicit in the use of Lagrangean techniques is a duality theory for
the optimal selection of the multipliers. We study this theory by considering
the discrete optimization problem in the general form

$$
\begin{aligned}
v = \min\ & f(x) \\
\text{s.t.}\ & g(x) \le b \\
& x \in X \subseteq R^n,
\end{aligned}
\tag{15}
$$

where $f$ is a scalar valued function defined on $R^n$, $g$ is a function from $R^n$
to $R^m$, and $X$ is a discrete set. If there is no $x \in X$ satisfying $g(x) \le b$, we
take $v = +\infty$. With very little loss of generality, we assume that $X$ is a
finite set; say $X = \{x^t\}_{t=1}^{T}$. Implicit in this formulation is the
assumption that the constraints $g(x) \le b$ make the problem substantially more
difficult to solve. Lagrangean techniques are applied by putting non-negative
multipliers on the constraints $g(x) \le b$ and placing them in the objective
function. Thus, the Lagrangean function derived from (15) is


$$
\begin{aligned}
L(u) &= -ub + \min_{x \in X} \{f(x) + u g(x)\} \\
     &= -ub + \min_{t = 1,\ldots,T} \{f(x^t) + u g(x^t)\}.
\end{aligned}
\tag{16}
$$

As we saw in the previous section, the specific algorithm used to compute $L$
may be a "good" algorithm, but even if it is not good, the intention is that
it is quite efficient empirically and derived from a simple dynamic programming
recursion or list processing scheme. Since $X$ is finite, $L$ is real valued for
all $u$. Moreover, it is a continuous, but nondifferentiable, concave function
(Rockafellar (1970)).


The combinatorial nature of the algorithms used in the Lagrangean
calculation is a main distinguishing characteristic of the use of Lagrangean
techniques in discrete optimization. This is in contrast to the application of
Lagrangean techniques in nonlinear programming where $f$ and $g$ are
differentiable, $X = R^n$ and the Lagrangean is minimized by solving the
nonlinear system $\nabla f(x) + u \nabla g(x) = 0$. A second distinguishing
characteristic of the use of Lagrangean techniques in discrete optimization is
the non-differentiability of $L$, due to the discreteness of $X$. This makes the
dual problem discussed below a non-differentiable optimization problem.

As it was for the zero-one IP problem discussed in the introduction, the
selection of $u$ in the Lagrangean is dictated by our desire to establish
sufficient optimality conditions for (15).

OPTIMALITY CONDITIONS: The pair $(\bar{x},\bar{u})$, where $\bar{x} \in X$ and
$\bar{u} \ge 0$, is said to satisfy the optimality conditions for the discrete
optimization problem (15) if

(i) $L(\bar{u}) = -\bar{u}b + f(\bar{x}) + \bar{u} g(\bar{x})$

(ii) $\bar{u}(g(\bar{x}) - b) = 0$

(iii) $g(\bar{x}) \le b$.

Theorem 1: If $(\bar{x},\bar{u})$ satisfy the optimality conditions for the discrete
optimization problem (15), then $\bar{x}$ is optimal in (15).

Proof: The solution $\bar{x}$ is clearly feasible in (15) since $\bar{x} \in X$ and
$g(\bar{x}) \le b$ by condition (iii). Let $x \in X$ be any other feasible solution
in (15). Then by condition (i),

$$L(\bar{u}) = -\bar{u}b + f(\bar{x}) + \bar{u} g(\bar{x}) \le -\bar{u}b + f(x) + \bar{u} g(x) \le f(x),$$

where the final inequality follows because $\bar{u} \ge 0$ and $g(x) - b \le 0$. But by
condition (ii), $L(\bar{u}) = f(\bar{x})$ and therefore $f(\bar{x}) \le f(x)$ for all
feasible $x$. ||

Implicit in the proof of theorem 1 was a proof of the following important
result.

Corollary 1 (weak duality). For any $u \ge 0$, $L(u) \le v$.

Our primary goal in selecting $u$ is to find one providing the greatest lower
bound, or in other words, one which is optimal in the dual problem

$$
\begin{aligned}
d = \max\ & L(u) \\
\text{s.t.}\ & u \ge 0.
\end{aligned}
\tag{17}
$$

Corollary 2. If $(\bar{x},\bar{u})$ satisfy the optimality conditions for the discrete
optimization problem (15), then $\bar{u}$ is optimal in the dual problem (17).

Proof: We have $L(\bar{u}) = -\bar{u}b + f(\bar{x}) + \bar{u} g(\bar{x}) = f(\bar{x}) = v$
by theorem 1. Since $L(u) \le v$ for all $u \ge 0$ by corollary 1, we have
$L(u) \le L(\bar{u})$ for all $u \ge 0$. ||

Thus, the indicated strategy for the application of Lagrangean
techniques is to first find an optimal $\bar{u}$ in the dual problem. Once this has
been done, we then try to find a complementary $\bar{x} \in X$ for which the
optimality conditions hold by calculating one or more solutions $\bar{x}$
satisfying $L(\bar{u}) = -\bar{u}b + f(\bar{x}) + \bar{u} g(\bar{x})$.

There is no guarantee that this strategy will succeed because (a) there may
be no $\bar{u}$ optimal in the dual for which the optimality conditions can be made
to hold for some $\bar{x} \in X$; (b) the specific optimal $\bar{u}$ we calculated does
not admit the optimality conditions for any $\bar{x} \in X$; or (c) the specific
$\bar{x}$ (or $\bar{x}$'s) in $X$ selected by minimizing the Lagrangean do not satisfy
the optimality conditions although some other $x \in X$ which is minimal in the
Lagrangean does satisfy them.

Lagrangean techniques can be applied in a fail-safe manner, however, if
they are embedded in branch and bound searches. This is discussed in section 5.
Alternatively, for some discrete optimization problems, it is possible to
strengthen the dual problem if it fails to yield an optimal solution to the
primal problem. Under certain conditions, the dual can be successively
strengthened until the optimality conditions are found to hold. This is
discussed in section 4.

For any $\bar u \ge 0$, it is easy to see by direct appeal to the optimality
conditions that $\bar x$ satisfying $L(\bar u) = -\bar u b + f(\bar x) + \bar u g(\bar x)$
is optimal in (15) with $b$ replaced by $g(\bar x) + \delta$, where $\delta$ is
non-negative and satisfies $\delta_i = 0$ if $\bar u_i > 0$. Thus, Lagrangean
techniques can be used in a heuristic manner to generate approximately optimal
solutions to (15) when the constraints $g(x) \le b$ are soft. Even if these
constraints are not soft, heuristic methods exploiting the specific structure
of (15) can be applied to perturb an infeasible $\bar x$ which almost satisfies
the constraints to try to find a good feasible solution. D'Aversa (1977) has
had success with this approach on IP problems.
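The perturbed right hand side property can be checked numerically on a toy
problem. The sketch below is only an illustration under stated assumptions:
the data `c`, `A` and the multiplier `u_bar` are hypothetical (not from the
text), $f(x) = cx$, $g(x) = Ax$, and $X = \{0,1\}^3$ is small enough to
enumerate. It minimizes the Lagrangean at `u_bar`, sets the new right hand
side to $g(\bar x)$ (so $\delta = 0$), and compares $f(\bar x)$ with the
brute-force optimum of the perturbed problem.

```python
from itertools import product

# Hypothetical data (not from the text): f(x) = c.x, g(x) = A x, X = {0,1}^3.
c = (-5, -4, -3)
A = ((4, 3, 2),)
X = list(product((0, 1), repeat=len(c)))

def f(x):
    return sum(cj * xj for cj, xj in zip(c, x))

def g(x):
    return tuple(sum(aij * xj for aij, xj in zip(row, x)) for row in A)

def lagrangean_minimizer(u):
    """Return some x in X minimizing f(x) + u g(x)."""
    return min(X, key=lambda x: f(x) + sum(ui * gi for ui, gi in zip(u, g(x))))

def primal_value(b):
    """Brute-force optimal value of min f(x) s.t. g(x) <= b, x in X."""
    return min(f(x) for x in X if all(gi <= bi for gi, bi in zip(g(x), b)))

u_bar = (1.3,)                       # any non-negative multiplier vector
x_bar = lagrangean_minimizer(u_bar)
b_soft = g(x_bar)                    # delta = 0 here because u_bar > 0
print(x_bar, f(x_bar), primal_value(b_soft))
```

On this instance the Lagrangean minimizer attains the brute-force optimum of
the problem whose right hand side is $g(\bar x)$, as the argument above
predicts.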
There are two distinct but related approaches to solving (17): it can
be viewed as a steepest ascent, nondifferentiable optimization problem, or
as a large scale linear programming problem. We discuss first the steepest
ascent approach. Our development is similar to that given in Fisher, Northup
and Shapiro (1975); see also Grinold (1970), (1972) and Shapiro (1977).

Although L is not everywhere differentiable, ascent methods can be constructed
using a generalization of the gradient. An m-vector $\gamma$ is called a
subgradient of L at $\bar u$ if

$$L(u) \le L(\bar u) + (u - \bar u)\gamma \quad \text{for all } u.$$

For any subgradient, it can easily be shown that the half space
$\{u \mid (u - \bar u)\gamma \ge 0\}$ contains all solutions to the dual with
higher values of L. In other words, any subgradient appears to point in a
direction of ascent of L at $\bar u$. A readily available subgradient is

$$\gamma = g(\bar x) - b \tag{18}$$

where $\bar x$ is any solution in X satisfying
$L(\bar u) = -\bar u b + f(\bar x) + \bar u g(\bar x)$. If there is a unique
$\bar x \in X$ attaining the minimum in $L(\bar u)$, then L is differentiable
there and $\gamma$ is the gradient.

The subgradient optimization method (Held and Karp (1971), Held, Wolfe
and Crowder (1974)) uses these subgradients to generate a sequence $\{u^k\}$ of
non-negative solutions to (17) by the rule

$$u_i^{k+1} = \max\{0,\ u_i^k + \theta_k \gamma_i^k\}, \quad i = 1, \ldots, m,$$

where $\gamma^k$ is any subgradient selected in (18) and $\theta_k > 0$ is the
step length. For example, if $\theta_k$ obeys $\theta_k \to 0^+$ and
$\sum_k \theta_k \to +\infty$, then it can be shown that $L(u^k) \to d$
(Poljak (1967)). Alternatively, finite convergence to any target value
$\bar d < d$ can be achieved if

$$\theta_k = \frac{\alpha_k \,(\bar d - L(u^k))}{\|\gamma^k\|^2} \tag{19}$$

where $\|\cdot\|$ denotes Euclidean norm and
$\epsilon_1 \le \alpha_k \le 2 - \epsilon_2$ for some $\epsilon_1, \epsilon_2 > 0$.

The latter choice of $\theta_k$ is usually an uncertain calculation in practice,
however, because the value d is not known and therefore a target value
$\bar d < d$ cannot be chosen with certainty.
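The update rule with step length (19) can be sketched in a few lines. The
following is a minimal illustration, not the implementation used in the cited
papers: the instance (`c`, `A`, `b`), the constant `alpha = 1.5`, the target
value and the iteration cap are all hypothetical choices, and the Lagrangean
is evaluated by enumerating $X = \{0,1\}^3$, which is feasible only for toy
problems.

```python
from itertools import product

# Hypothetical instance (not from the text): min c.x s.t. A x <= b, x in {0,1}^3.
c = (-5, -4, -3)
A = ((4, 3, 2),)
b = (6,)
X = list(product((0, 1), repeat=len(c)))

def f(x):
    return sum(cj * xj for cj, xj in zip(c, x))

def g(x):
    return tuple(sum(aij * xj for aij, xj in zip(row, x)) for row in A)

def lagrangean(u):
    """Return (L(u), x_bar) with x_bar attaining the minimum over X."""
    def value(x):
        return f(x) + sum(ui * (gi - bi) for ui, gi, bi in zip(u, g(x), b))
    x_bar = min(X, key=value)
    return value(x_bar), x_bar

def subgradient_ascent(target, alpha=1.5, max_iter=50):
    """Apply the update rule with step length (19) until L(u) >= target."""
    u = [0.0] * len(b)
    bounds = []
    for _ in range(max_iter):
        L, x_bar = lagrangean(u)
        bounds.append(L)
        if L >= target:
            break
        gamma = [gi - bi for gi, bi in zip(g(x_bar), b)]    # subgradient (18)
        norm2 = sum(gi * gi for gi in gamma)
        if norm2 == 0:            # zero subgradient: u is already dual optimal
            break
        theta = alpha * (target - L) / norm2                # step length (19)
        u = [max(0.0, ui + theta * gi) for ui, gi in zip(u, gamma)]
    return u, bounds

u, bounds = subgradient_ascent(target=-8.3)
print(u, bounds)
```

For this instance the dual optimum is $d = -8.25$, so the target $-8.3$ lies
below $d$ and the loop stops as soon as a bound at least as large as the
target is found. With a target above $d$ the loop would simply run to
`max_iter`.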




There is no guarantee when using subgradient optimization that
$L(u^{k+1}) > L(u^k)$, although practice has shown that increasing lower bounds
can be expected on most steps under the correct combination of artistic
expertise and luck. Thus, subgradient optimization using the rule (19) for the
step length is essentially a heuristic method with theoretical as well as
empirical justification. It can be combined with convergent ascent methods for
solving (17) based on the simplex method which we now discuss.


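The dual problem (17) is equivalent to the LP problem

$$
\begin{aligned}
d = \max\ & \nu \\
\text{s.t.}\ & \nu \le -ub + f(x^t) + u g(x^t), \quad t = 1, \ldots, T, \\
& u \ge 0,
\end{aligned}
\tag{20}
$$

because, for any $u \ge 0$, the maximal choice $\nu(u)$ of $\nu$ is

$$-ub + \min_{t = 1, \ldots, T} \{f(x^t) + u g(x^t)\} = L(u).$$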
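The equivalence between the dual and the LP (20) can be illustrated on a
problem with a single multiplier, where (20) has only the two variables
$(u, \nu)$ and can be solved by inspecting its vertices. This is a sketch under
stated assumptions: the instance is hypothetical, it has one constraint, the
primal is feasible (so that (20) is bounded), and enumeration of all $T = 8$
rows is affordable only because the example is tiny.

```python
from itertools import combinations, product

# Hypothetical single-constraint instance (not from the text).
c = (-5, -4, -3)
a = (4, 3, 2)
b = 6
X = list(product((0, 1), repeat=len(c)))

def f(x):
    return sum(cj * xj for cj, xj in zip(c, x))

def g(x):
    return sum(aj * xj for aj, xj in zip(a, x))

# Row t of (20) reads nu <= f(x^t) + u * (g(x^t) - b): an (intercept, slope) pair.
rows = [(f(x), g(x) - b) for x in X]

def L(u):
    """Largest nu allowed by all rows of (20) at a fixed u, i.e. L(u)."""
    return min(intercept + u * slope for intercept, slope in rows)

# With one multiplier, an optimal vertex of (20) has u = 0 or lies where two rows cross.
candidates = [0.0]
for (i1, s1), (i2, s2) in combinations(rows, 2):
    if s1 != s2:
        u = (i1 - i2) / (s2 - s1)
        if u >= 0:
            candidates.append(u)

u_star = max(candidates, key=L)
d = L(u_star)                                   # dual value
v = min(f(x) for x in X if g(x) <= b)           # primal value by enumeration
print(u_star, d, v)
```

Here the dual value is strictly below the primal value, so this small
instance also exhibits a duality gap.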


Problem  (20)  is  usually  a large  scale  LP  because  the  number  T of  constraints 
can  easily  be  on  the  order  of  thousands  or  millions.  For  example,  in  the 
traveling  salesman  dual  problem  discussed  in  section  2,  T equals  the  number  of 


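The LP dual of (20) is

$$
\begin{aligned}
d = \min\ & \sum_{t=1}^{T} f(x^t)\lambda_t && \text{(21a)} \\
\text{s.t.}\ & \sum_{t=1}^{T} g(x^t)\lambda_t \le b && \text{(21b)} \\
& \sum_{t=1}^{T} \lambda_t = 1 && \text{(21c)} \\
& \lambda_t \ge 0, \quad t = 1, \ldots, T. && \text{(21d)}
\end{aligned}
$$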

This version of the dual problem clearly illustrates the nature of the
convexification inherent in the application of Lagrangean techniques to
the discrete optimization problem (15).

For decomposable discrete optimization problems with separable
Lagrangeans, such as the multi-item production scheduling and inventory control
problem (10), the dual problem in the form (21) has a convexification constraint
(21c) for each component in the Lagrangean. The number of such constraints for
the production scheduling problem is I (the number of items being scheduled),
and the number of joint constraints (21b) is T (one for each time period). If
I > T, then an optimal solution to (21) found by a simplex algorithm will have
pure strategies for at least I - T items; that is, one $\lambda$ variable equal
to one for these items. This follows because a basic solution has at most I + T
positive variables while each of the I convexity rows requires at least one.
If I >> T, then Lagrangean techniques give a good approximation to an optimal
solution to (10) because pure strategies are selected for most of the items.
Roughly speaking, when I >> T, the duality gap between (10) and its dual is
small.




Solution of the dual problem in its LP form (20) or (21) can be accomplished
by a number of algorithms. One possibility is generalized linear
programming, otherwise known as Dantzig-Wolfe decomposition (see Dantzig
and Wolfe (1960), Lasdon (1970), or Magnanti, Shapiro and Wagner (1976)).
Generalized linear programming proceeds by solving (21) with a subset of
the T columns; this LP problem is called the Master Problem. A potential
new column for the Master is generated by finding $\bar x \in X$ satisfying
$L(-\bar\pi) = \bar\pi b + f(\bar x) - \bar\pi g(\bar x)$, where $\bar\pi \le 0$
is the vector of optimal LP shadow prices on rows (21b) calculated for the
Master problem. If $L(-\bar\pi) < \bar\pi b + \bar\theta$, where $\bar\theta$ is
the optimal shadow price on the convexity row (21c), then the new column
corresponding to $\bar x$ is added to the Master with a new $\lambda$ variable.
If $L(-\bar\pi) = \bar\pi b + \bar\theta$ (the ">" case is not possible), then
the optimal solution to the Master is optimal in the version (21) of the dual
problem.

Note that if we required $\lambda_t$ to be integer in version (21) of the dual
problem, then (21) would be equivalent to the primal problem (15). Moreover,
the dual solves the primal problem (15) if there is exactly one $\lambda_t$ at
a positive level in the optimal solution to (20); say, $\lambda_r = 1$,
$\lambda_t = 0$, $t \ne r$. In that case, $x^r \in X$ is the optimal solution to
the primal problem and we have found it by the use of Lagrangean techniques.
Conversely, suppose more than one $\lambda_t$ is at a positive level in the
optimal solution to (21), say $\lambda_1 > 0, \ldots, \lambda_r > 0$,
$\lambda_t = 0$, $t \ge r+1$. Then in all likelihood the solution
$\sum_{t=1}^{r} \lambda_t x^t$ is not in X since X is discrete, and the dual
problem has failed to yield an optimal solution to the primal problem (15).
Even if $y = \sum_{t=1}^{r} \lambda_t x^t \in X$, there is no guarantee that y
is optimal because optimality conditions (ii) and (iii) can fail to hold for y.
In the next section we discuss how this difficulty can be overcome, at least in
theory, and in section 5 we discuss the use of Lagrangean techniques in
conjunction with branch and bound.


Generalized linear programming has some drawbacks as a technique for
generating Lagrange multipliers in discrete optimization. It has not
performed consistently (Orchard-Hays (1968)), although recent modifications of
the approach such as BOXSTEP (Hogan, Marsten and Blankenship (1975)) have
performed better. A second difficulty is that it does not produce
monotonically increasing lower bounds to the primal objective function minimum.
Monotonically increasing bounds are desirable for computational efficiency
when Lagrangean techniques are used with branch and bound. A hybrid approach
that is under investigation is to use subgradient optimization on the dual
problem as an opening strategy and then switch to generalized linear
programming when it slows down or performs erratically. The hope is that the
generalized linear programming algorithm will then perform well because the
first Master LP will have an effective set of columns generated by subgradient
optimization with which to optimize.

An alternative convergent algorithm for the dual problem is an ascent
method based on a generalized version of the primal-dual simplex algorithm.
We present this method mainly because it provides insights into the theory of
nondifferentiable optimization which is central to the selection of Lagrange
multipliers for discrete optimization problems. Its computational effectiveness
is uncertain although it has been implemented successfully for IP dual problems
(see Fisher, Northup and Shapiro (1975), which also contains proofs of
assertions).

The idea of the primal-dual ascent algorithm can best be developed by
considering a difficulty which can occur in trying to find a direction of
ascent at a point $\bar u$ with positive components where L is
nondifferentiable. The situation is depicted in figure 1. The vectors
$\gamma^1$ and $\gamma^2$ are distinct subgradients of L at $\bar u$ and they
both point into the half space containing points u such that
$L(u) \ge L(\bar u)$. Neither of these subgradients points in a direction of
ascent of L; the directions of ascent are given by the shaded region which is
the intersection of the two half spaces $\{u \mid (u - \bar u)\gamma^1 \ge 0\}$
and $\{u \mid (u - \bar u)\gamma^2 \ge 0\}$.

In general, feasible directions of ascent of L at a point $\bar u$ can be
discovered only by considering at least implicitly the collection of all
subgradients at that point. This set is called the subdifferential and denoted
by $\partial L(\bar u)$. The directional derivative $\nabla L(\bar u; v)$ of L
at the point $\bar u$ in the feasible direction v is given by (see
Grinold (1970))

$$\nabla L(\bar u; v) = \min_{\gamma \in \partial L(\bar u)} v\gamma. \tag{22}$$

The relation (22) is used to construct an LP problem yielding a feasible
direction of ascent, if there is one; namely, a feasible direction v such
that $\nabla L(\bar u; v) > 0$. Two sets to be used in the construction of the
direction finding LP are

$$V(\bar u) = \{v \in R^m \mid 0 \le v_i \le 1 \text{ for } i \text{ such that } \bar u_i = 0;\ -1 \le v_i \le 1 \text{ for } i \text{ such that } \bar u_i > 0\}$$

and

$$T(\bar u) = \{t \mid L(\bar u) = -\bar u b + f(x^t) + \bar u g(x^t)\}.$$

Without loss of generality, we can limit our search for a feasible direction
of ascent to the set $V(\bar u)$. The subdifferential $\partial L(\bar u)$ is
the convex hull of the points $\gamma^t = g(x^t) - b$, $t \in T(\bar u)$, and
this permits us to characterize the directional derivative by the formula

$$\nabla L(\bar u; v) = \min_{t \in T(\bar u)} v\gamma^t. \tag{23}$$
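Formula (23) is straightforward to evaluate once the active set $T(\bar u)$
is known. The sketch below assumes a hypothetical toy instance (not from the
text) whose solution set X can be enumerated, and it identifies active pieces
with a numerical tolerance `tol`, which is also an assumption of this
illustration.

```python
from itertools import product

# Hypothetical instance (not from the text): f(x) = c.x, g(x) = A x, X = {0,1}^3.
c = (-5, -4, -3)
A = ((4, 3, 2),)
b = (6,)
X = list(product((0, 1), repeat=len(c)))

def f(x):
    return sum(cj * xj for cj, xj in zip(c, x))

def g(x):
    return tuple(sum(aij * xj for aij, xj in zip(row, x)) for row in A)

def piece(x, u):
    """Value at u of the affine piece -ub + f(x) + u g(x) generated by x."""
    return f(x) + sum(ui * (gi - bi) for ui, gi, bi in zip(u, g(x), b))

def active_subgradients(u, tol=1e-9):
    """Subgradients gamma^t = g(x^t) - b for t in T(u)."""
    L = min(piece(x, u) for x in X)
    return [tuple(gi - bi for gi, bi in zip(g(x), b))
            for x in X if piece(x, u) <= L + tol]

def directional_derivative(u, v):
    """Formula (23): minimum of v . gamma^t over the active set T(u)."""
    return min(sum(vi * gi for vi, gi in zip(v, gamma))
               for gamma in active_subgradients(u))

print(directional_derivative((1.0,), (1,)))
print(directional_derivative((1.25,), (1,)), directional_derivative((1.25,), (-1,)))
```

At `u = 1.25` two pieces are active and the derivative is negative in both
directions, so no ascent direction exists there; at `u = 1.0` only one piece
is active and moving in the direction `v = 1` increases L.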

If the non-negative vector $\bar u$ is not optimal in the dual problem, then
a direction of ascent of L at $\bar u$ can be found by solving the LP problem

$$
\begin{aligned}
\bar\nu = \max\ & \nu \\
\text{s.t.}\ & \nu \le v\gamma^t, \quad t \in T(\bar u), \\
& v \in V(\bar u).
\end{aligned}
\tag{24}
$$

Conversely, if $\bar u$ is optimal in the dual problem, then $\bar\nu = 0$ is
the optimal objective function value in (24). Note that from (23) we have

$$\bar\nu = \max_{v \in V(\bar u)} \min_{t \in T(\bar u)} v\gamma^t = \max_{v \in V(\bar u)} \nabla L(\bar u; v)$$

and the LP (24) will pick out a direction of ascent with
$\nabla L(\bar u; v) > 0$ if there is one.

Once an ascent direction $v \ne 0$ with $\nabla L(\bar u; v) = \bar\nu > 0$ has
been computed from (24), the step length $\theta > 0$ in that direction is
chosen to be the maximal value of $\theta$ satisfying
$L(\bar u + \theta v) = L(\bar u) + \theta\bar\nu$. This odd choice of $\theta$
is needed to ensure convergence of the ascent method by guaranteeing that the
quantity $\bar\nu$ strictly decreases from dual feasible point to dual feasible
point (under the usual LP non-degeneracy assumptions). This is the criterion of
the primal-dual simplex algorithm which in fact we are applying to the dual
problem in the dual LP forms (20) and (21).
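For a piecewise linear L the maximal step length can be computed with a
ratio test. The sketch below is one way to do it under stated assumptions,
not necessarily the procedure of the cited paper: the instance is hypothetical,
X is enumerated explicitly, and the check that $\bar u + \theta v$ remains
non-negative is omitted. It uses the fact that $L(\bar u + \theta v)$ is the
minimum over x of the affine pieces, so the equality holds until some piece
with slope below $\bar\nu$ drops under the line $L(\bar u) + \theta\bar\nu$.

```python
from itertools import product

# Hypothetical instance (not from the text): f(x) = c.x, g(x) = A x, X = {0,1}^3.
c = (-5, -4, -3)
A = ((4, 3, 2),)
b = (6,)
X = list(product((0, 1), repeat=len(c)))

def f(x):
    return sum(cj * xj for cj, xj in zip(c, x))

def g(x):
    return tuple(sum(aij * xj for aij, xj in zip(row, x)) for row in A)

def piece(x, u):
    """Value at u of the affine piece -ub + f(x) + u g(x) generated by x."""
    return f(x) + sum(ui * (gi - bi) for ui, gi, bi in zip(u, g(x), b))

def L(u):
    return min(piece(x, u) for x in X)

def max_step(u, v, nu):
    """Largest theta with L(u + theta v) = L(u) + theta * nu (ratio test)."""
    L_u = L(u)
    theta = float("inf")
    for x in X:
        slope = sum(vi * (gi - bi) for vi, gi, bi in zip(v, g(x), b))
        if slope < nu:     # this piece eventually falls below the ascent line
            theta = min(theta, (piece(x, u) - L_u) / (nu - slope))
    return theta

u_bar, v, nu_bar = (1.0,), (1,), 3     # nu_bar is the directional derivative at u_bar
theta = max_step(u_bar, v, nu_bar)
u_next = tuple(ui + theta * vi for ui, vi in zip(u_bar, v))
print(theta, L(u_bar), L(u_next))
```

If `nu` overestimates the true directional derivative, some active piece has
a smaller slope and the ratio test returns zero, which corresponds to the
case $\theta' = 0$ in the row generation scheme.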

The difficulty with problem (24) is the possibly large number of
constraints $\nu \le v\gamma^t$ since the set $T(\bar u)$ can be large. This
can be overcome by successive generation of rows for (24) as follows. Suppose
we solve (24) with rows defined for $\gamma^t$, $t \in T'(\bar u) \subseteq T(\bar u)$,
and obtain an apparent direction $v'$ of ascent satisfying
$\nu' = \min_{t \in T'(\bar u)} v'\gamma^t > 0$. We compute as before the
maximal value $\theta'$ of $\theta$ such that
$L(\bar u + \theta v') = L(\bar u) + \theta\nu'$. If $\theta' > 0$, then we
proceed to $\bar u + \theta' v'$. If $\theta' = 0$, then it can be shown that
we have found as a result of the line search a subgradient $\gamma^s$,
$s \in T(\bar u) - T'(\bar u)$, which satisfies $\nu' > v'\gamma^s$. In the
latter case, we add $\nu \le v\gamma^s$ to the abbreviated version of (24) and
repeat the direction finding procedure by re-solving it.

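The dual problem is solved and $\bar u$ is found to be optimal if and only if
$\bar\nu = 0$ in problem (24). Some additional insight is gained if we consider
the LP dual to (24)

$$
\begin{aligned}
\bar\nu = \min\ & \sum_{i=1}^{m} s_i^- + \sum_{i \in I^c(\bar u)} s_i^+ \\
\text{s.t.}\ & \sum_{t \in T(\bar u)} \lambda_t \gamma_i^t - s_i^- + s_i^+ = 0, \quad i \in I^c(\bar u), \\
& \sum_{t \in T(\bar u)} \lambda_t \gamma_i^t - s_i^- \le 0, \quad i \in I(\bar u), \\
& \sum_{t \in T(\bar u)} \lambda_t = 1, \\
& \lambda_t \ge 0,\ s_i^- \ge 0,\ s_i^+ \ge 0,
\end{aligned}
\tag{25}
$$

where $I(\bar u) = \{i \mid \bar u_i = 0\}$ and $I^c(\bar u)$ is its complement.
Problem (25) states, in effect, that $\bar u$ is optimal in the dual problem if
and only if there exists a subgradient $\gamma \in \partial L(\bar u)$
satisfying $\gamma_i = 0$ for i such that $\bar u_i > 0$ and $\gamma_i \le 0$
for i such that $\bar u_i = 0$. Moreover, the columns $\gamma^t$ for
$t \in T(\bar u)$ and $\lambda_t > 0$ are an optimal set of columns for the
dual problem in the form (21).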

The close relationship among the concepts of dualization, convexification
and the differentiability of L is again evident. Specifically, a sufficient,
but not necessary, condition for there to be no duality gap is that L is
differentiable at some optimal solution $\bar u$. If such is the case, then
$\partial L(\bar u)$ consists of the single vector $\gamma = g(\bar x) - b$
which by necessity is the optimal column in (25). The necessary and sufficient
condition for dual problem optimality that $\bar\nu = 0$ in (25) implies that
$\bar x$ satisfies the optimality conditions, implying it is optimal in the
primal problem.
4. Resolution of Duality Gaps

We have mentioned several times in previous sections that a dual
problem may fail to solve a given primal problem because of a duality gap.
In this section, we examine how Lagrangean techniques can be extended to
guarantee the construction of a dual problem which yields an optimal solution
to the primal problem. The development will be presented mainly in terms of
the zero-one IP problem (1) as developed in Bell and Shapiro (1976), but the
theory is applicable to the general class of discrete optimization problems
(15). The theory permits the complete resolution of duality gaps.
Nevertheless, numerical excesses can occur on some problems, making it
difficult in practice to follow the theory to its conclusion. The practical
resolution of this difficulty is to imbed the dual analysis in branch and
bound as described in section 5. Although it was not realized at the time it
was invented, the cutting plane method for IP (Gomory (1958)) is a method for
resolving IP duality gaps. We will make this connection in our development
here, indicate why the cutting plane method proved to be empirically
inefficient, and argue that the dual approach to IP largely supersedes the
cutting plane method.

We saw in section two how a dual problem to the zero-one IP problem (1)
could be constructed using a group homomorphism to aggregate the constraints
$Ax \le b$. The relationship of this IP dual problem to problem (1) can be
investigated using the duality theory developed in the previous section.
Recall that the Lagrangean for the IP dual was defined for $u \ge 0$ as
(see (14))

$$L(u) = -ub + \min_{(x,s) \in Y} \{(c + uA)x + us\},$$

where

$$Y = \Big\{(x,s) \,\Big|\, \sum_{j=1}^{n} \alpha_j x_j + \sum_{i=1}^{m} \epsilon_i s_i = \beta,\ x_j = 0 \text{ or } 1,\ s_i = 0, 1, 2, \ldots \Big\}. \tag{26}$$

Although the slack variables in Y are not explicitly bounded, we can without
loss of generality limit Y to a finite set, say $Y = \{(x^t, s^t)\}_{t=1}^{T}$.
This is because any feasible slack vector $s = b - Ax$ is implicitly bounded by
the zero-one constraints on x.

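The general discrete optimization dual problem in the form (21) is
specialized to give the following representation of the IP dual problem

$$
\begin{aligned}
d = \min\ & \sum_{t=1}^{T} (c x^t)\lambda_t \\
\text{s.t.}\ & \sum_{t=1}^{T} (A x^t + I s^t)\lambda_t = b, \\
& \sum_{t=1}^{T} \lambda_t = 1, \\
& \lambda_t \ge 0, \quad t = 1, \ldots, T.
\end{aligned}
\tag{27}
$$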

This formulation of the IP dual problem provides us with the insights
necessary to make several important connections between Lagrangean techniques
and the cutting plane method. The convexification in problem (27) can be
written in more compact form as

$$
\begin{aligned}
d = \min\ & cx \\
\text{s.t.}\ & (x, s) \in \{(x, s) \mid Ax + Is = b,\ 0 \le x_j \le 1,\ 0 \le s_i \le M_i\} \cap [Y],
\end{aligned}
\tag{28}
$$

where "[ ]" denotes convex hull and $M_i$ is the upper bound on the slack
variable $s_i$. In words, the IP dual problem is effectively the problem of
minimizing cx over the intersection of the LP feasible region with the
polyhedron [Y]. Inequalities based on the faces of [Y] are cuts, and there will
generally be an extremely large number of them. The computational inefficiency
of the cutting plane method is due in large part to the algorithmic ambiguity
created by this proliferation of cuts.

Lagrangean techniques and the IP dual problem provide a rationale for
selecting cuts, but in the process make the use of cuts largely superfluous.
For any $u \ge 0$, the inequality

$$(c + uA)x + us \ge L(u) + ub \tag{29}$$

is a supporting hyperplane of [Y]. Since Y contains all feasible solutions
to the zero-one IP problem, (29) is a valid cut which can be added to any
LP relaxation of the problem which includes the constraints $Ax + Is = b$.
Its effect on an LP relaxation would be to ensure that the objective function
value cx would be at least L(u) (Shapiro (1971)). Thus, the strongest cut
in terms of forcing the objective function to increase is one derived from
a dual vector u that is optimal in the dual problem. Furthermore, the
procedure for selecting a cut according to this criterion is to solve the dual
problem by one or more of the methods of the previous section which, as we
see from problem (28), implicitly considers all cuts (i.e., all faces of [Y])
without generating any of them.

If an optimal solution to the dual problem produces an optimal solution
to the zero-one IP problem, then a cut is not needed. If, on the other hand,
an optimal zero-one solution is not produced, then a cut of the form (29)
written with respect to an optimal dual solution $u^*$ has the same effect on
the objective function as all the cuts implied by [Y]. The addition of such
a cut to an LP relaxation would permit the IP dual analysis to continue in
the sense that a stronger IP dual of the form (27) could be derived. However,
the construction of Bell and Shapiro (1977) attacks more directly the problem
of strengthening the IP dual when it does not produce an optimal solution to
the zero-one IP problem.


Solution of the zero-one IP problem (1) by Lagrangean techniques is
constructively achieved by generating a finite sequence of groups $\{G^k\}$,
sets $\{Y^k\}$ and IP dual problems analogous to (27) with objective function
value $d^k$. The groups have the property that $G^k$ is a subgroup of
$G^{k+1}$, implying by the construction that $Y^{k+1} \subseteq Y^k$ and
therefore that $v \ge d^{k+1} \ge d^k$. The critical step in this approach to
solving (1) is that if an optimal solution to the kth dual does not yield an
optimal solution to (1), then we are able to construct $G^{k+1}$ so that
$Y^{k+1} \subsetneq Y^k$. The construction uses as its point of departure the
following result.

Theorem 2 (Bell and Shapiro): If only one $\lambda_t$ is positive in an optimal
basic solution to (27), then the corresponding solution $(x^t, s^t)$ is optimal
in the zero-one IP problem. On the other hand, if more than one $\lambda_t$ is
positive, then all the $(x^t, s^t)$ corresponding to basic $\lambda_t$ are
infeasible in the zero-one IP problem.

When more than one $\lambda_t$ is positive in an optimal basic solution to
(27), then we can use a number theoretic procedure on the columns in (27)
with $\lambda_t$ positive to construct a new group with the property that the
corresponding $(x^t, s^t)$ are infeasible in the new group equation. Thus, they
are not considered in the Lagrangean calculation. Since at least two solutions
are eliminated each time the dual is strengthened, and since the set of (x,s)
to be considered is finite, the process converges to an IP dual problem of
the form (27) which yields an optimal solution to the zero-one IP problem.

Computational experience with the IP dual problem (27) is given in
Fisher, Northup and Shapiro (1975). D'Aversa (1977) has encoded the iterative
IP dual analysis outlined above and experimentation is underway with it.
The IP dual approach has been extended to mixed IP by Bell (1977) and
Northup and Shapiro (1977). Burdet and Johnson (1975) have applied some
concepts from convex analysis in the construction of IP methods which bear a
resemblance to the methods just discussed.


The approach just outlined is applicable to the general discrete
optimization problem (21) as long as $g(x^t) - b$ is a rational vector. If more
than one $\lambda_t$ is positive in (21), a group structure could be induced
which would exclude infeasible solutions $x^t$ from consideration in the
Lagrangean (16). This would be accomplished by intersecting X with the set of
solutions satisfying a group equation which would, however, make the algorithm
for the Lagrangean more complex. See Bell (1973) for an application of this
approach to the traveling salesman dual problem to maximize the Lagrangean (12).


5.  Uses  of  Lagrangean  techniques  in  branch  and  bound 

Branch and bound is a method guaranteed to find an optimal solution
to the general discrete optimization problem (15) by a systematic search
of the discrete solution set X. The efficiency of the search is determined
in large part by the strength of the bounds used in limiting it. Bounds
are often derived from LP relaxations of a given discrete optimization
problem which, as we have seen, arise naturally as dual problems for selecting
Lagrange multipliers. Lagrangean analyses can also be used to indicate the
most promising variables on which to branch. Conversely, branch and bound
can be viewed as a method for perturbing a given discrete optimization
problem when Lagrangean techniques fail to yield an optimal solution to it.


We describe the integration of Lagrangean techniques with branch and
bound in terms of the general discrete optimization problem (15). Our
development follows closely that of Fisher, Northup and Shapiro (1975). The
branch and bound search of the set X is done in a non-redundant and implicitly
exhaustive fashion. At any stage of computation, the least cost known solution
$\hat x \in X$ satisfying $g(\hat x) \le b$ is called the incumbent with
incumbent cost $\hat z = f(\hat x)$. Branch and bound generates a sequence of
subproblems of the form

$$
\begin{aligned}
v(X^k) = \min\ & f(x) \\
\text{s.t.}\ & g(x) \le b, \\
& x \in X^k,
\end{aligned}
\tag{30}
$$

where $X^k \subseteq X$. The set $X^k$ is selected to preserve the special
structure of X. If we can find an optimal solution to (30), then we have
implicitly tested all subproblems of the form (30) with $X^k$ replaced by
$X^\ell \subseteq X^k$ and such subproblems do not have to be explicitly
enumerated. The same conclusion holds if we can ascertain that
$v(X^k) \ge \hat z$ without actually discovering the precise value of
$v(X^k)$. If either of these two cases obtains, then we say that the
subproblem (30) has been fathomed. If it is not fathomed, then we separate
(30) into new subproblems of the form (30) with $X^k$ replaced by $X^\ell$,
$\ell = 1, \ldots, L$, and

$$\bigcup_{\ell=1}^{L} X^\ell = X^k, \qquad X^{\ell_1} \cap X^{\ell_2} = \emptyset, \quad \ell_1 \ne \ell_2.$$

Lagrangean techniques are used to try to fathom the subproblem (30) by
solution of the dual problem

$$
\begin{aligned}
d(X^k) = \max\ & L(u; X^k) \\
\text{s.t.}\ & u \ge 0,
\end{aligned}
\tag{31}
$$

where

$$L(u; X^k) = -ub + \min_{x \in X^k} \{f(x) + u g(x)\}. \tag{32}$$


The use of (31) in analyzing (30) is illustrated in figure 2, taken from
Fisher, Northup and Shapiro (1975), which we now discuss step by step.

Steps 1 and 2: Often the initial subproblem list consists of only
one subproblem corresponding to X.

Step 3: A good starting dual solution $\bar u \ge 0$ is usually
available from previous computations.

Step 4: Computing the Lagrangean can be a network optimization
problem: a shortest route type computation for integer programming, minimum
spanning tree for the traveling salesman problem, dynamic programming shortest
route computation for resource constrained network scheduling problems, etc.

Step 5: As a result of step 4, the lower bound $L(\bar u; X^k)$ on
$v(X^k)$ is available, and it should be clear that (30) is fathomed if
$L(\bar u; X^k) \ge \hat z$ since $L(\bar u; X^k) \le v(X^k)$.

Steps 6, 7, 8: Let $\bar x \in X^k$ be an optimal solution in (32) and suppose
$\bar x$ is feasible, i.e. $g(\bar x) \le b$. Since (30) was not fathomed
(step 5), we have $L(\bar u; X^k) = f(\bar x) + \bar u(g(\bar x) - b) < \hat z$
with the quantity $\bar u(g(\bar x) - b) \le 0$. Thus, it may or may not be
true that $f(\bar x) < \hat z$, but if so, then the incumbent $\hat x$ should
be replaced by $\bar x$. In any case, if $\bar x$ is feasible, we have by the
duality theory discussed in section 3 that
$f(\bar x) + \bar u(g(\bar x) - b) \le v(X^k) \le f(\bar x)$ and therefore
$\bar x$ is optimal in (30) if $\bar u(g(\bar x) - b) = 0$; i.e., if
complementary slackness holds.
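The tests of steps 4 through 8 for a single multiplier vector can be written
as one small function. This is a sketch under stated assumptions: the data
`c` and `A`, the function name `analyze` and its return convention are all
hypothetical, $X^k$ is passed as an explicit list, and the Lagrangean (32) is
evaluated by enumeration rather than by a special-structure algorithm.

```python
from itertools import product

# Hypothetical data (not from the text): f(x) = c.x, g(x) = A x.
c = (-5, -4, -3)
A = ((4, 3, 2),)

def f(x):
    return sum(cj * xj for cj, xj in zip(c, x))

def g(x):
    return tuple(sum(aij * xj for aij, xj in zip(row, x)) for row in A)

def analyze(u, Xk, b, z_hat, tol=1e-9):
    """Steps 4-8 for one u >= 0 on the subproblem defined by Xk.

    Returns (status, x_bar, L, improves) where `improves` says whether x_bar
    is feasible and cheaper than the incumbent cost z_hat.
    """
    def value(x):
        return f(x) + sum(ui * (gi - bi) for ui, gi, bi in zip(u, g(x), b))
    x_bar = min(Xk, key=value)                     # step 4: evaluate (32)
    L = value(x_bar)
    feasible = all(gi <= bi for gi, bi in zip(g(x_bar), b))
    improves = feasible and f(x_bar) < z_hat       # steps 6-7: incumbent test
    if L >= z_hat - tol:                           # step 5: fathomed by bound
        return "fathomed_by_bound", x_bar, L, improves
    slack_term = sum(ui * (gi - bi) for ui, gi, bi in zip(u, g(x_bar), b))
    if feasible and abs(slack_term) <= tol:        # step 8: complementary slackness
        return "optimal_in_subproblem", x_bar, L, improves
    return "not_fathomed", x_bar, L, improves

X = list(product((0, 1), repeat=3))
X0 = [x for x in X if x[0] == 0]                   # subproblem with x_1 fixed at 0
print(analyze((0.0,), X0, (6,), -8))
print(analyze((1.25,), X, (6,), -8))
print(analyze((1.3,), X, (5,), float("inf")))
```

The caller is expected to replace the incumbent whenever `improves` is true
and to continue with a new multiplier vector or a separation when the status
is `not_fathomed`.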


Step 9: This may be a test for optimality in the dual of the
current $\bar u$. Alternatively, it may be a test of recent improvement in the
dual lower bound. If generalized linear programming is used to solve the
dual, then it provides at each iteration an upper bound $\bar d$ on $d(X^k)$.
Thus, if $\bar d < \hat z$, we know that the subproblem (30) will never be
fathomed by bound by the given dual. Finally, as we indicated in section 4, if
the given dual problem proves unsatisfactory, then it can sometimes be
strengthened depending on the nature of the primal problem.

Figure 2

Step 10: The selection of a new $\bar u \ge 0$ depends upon which of the
methods discussed in section 3 is being used and which of these methods have
proven effective on the same type of problem in the past. When subgradient
optimization is used, the incumbent value $\hat z$ can be used in place of
$\bar d$ as the target value in selecting the step length (19). The rationale
for this choice is the desire to fathom (30) by bound using the dual by finding
$\bar u \ge 0$ such that $L(\bar u; X^k) \ge \hat z$. Computational experience
has shown that subgradient optimization has a good chance of quickly finding
such a $\bar u$ if $d(X^k)$ is somewhat above $\hat z$, and it also produces
monotonically increasing lower bounds. Conversely, if $d(X^k) < \hat z$ and
$\hat z$ is used as the target, the lower bounds produced by subgradient
optimization will not be monotonic and a wobbling pattern will be observed. In
the latter case, persistence with the dual (step 9) is not attractive.

Steps 11, 12: The separation of the given subproblem can often be
done on the basis of information provided by the dual problem. For example,
in integer programming, the problem may be separated into two descendants
with a zero-one variable $x_j$ set to zero and one respectively, where $x_j$
is chosen so that the reduced cost is minimal. It is important to point out
that the greatest lower bound obtained during the dual analysis of (30)
remains a valid lower bound on a subproblem derived from (30) with
$X^\ell \subseteq X^k$. In step 12, the new subproblem selected from the
subproblem list can be one with low or minimal lower bound.
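The steps above can be assembled into a compact branch and bound loop. The
following is a minimal sketch, not the procedure of figure 2 itself, and it
rests on several assumptions that are not in the text: subproblems fix a
prefix of the zero-one variables, $X^k$ is enumerated explicitly, the dual is
attacked with a fixed number `dual_iters` of subgradient steps using the
incumbent cost as target in (19), a step of $1/(k+1)$ is used while no
incumbent exists, and subproblems are taken last-in first-out.

```python
from itertools import product
import math

def branch_and_bound(c, A, b, dual_iters=25, tol=1e-9):
    """Minimize c.x s.t. A x <= b, x in {0,1}^n, with Lagrangean bounds."""
    n, m = len(c), len(b)

    def f(x):
        return sum(cj * xj for cj, xj in zip(c, x))

    def excess(x):                                  # g(x) - b, the subgradient (18)
        return [sum(aij * xj for aij, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]

    x_hat, z_hat = None, math.inf                   # incumbent and its cost
    subproblems = [()]                              # X^k = all x with a fixed prefix
    while subproblems:
        prefix = subproblems.pop()
        Xk = [prefix + tail for tail in product((0, 1), repeat=n - len(prefix))]
        u = [0.0] * m
        fathomed = False
        for k in range(dual_iters):
            def value(x):
                return f(x) + sum(ui * gi for ui, gi in zip(u, excess(x)))
            x_bar = min(Xk, key=value)              # evaluate L(u; X^k)
            L = value(x_bar)
            gamma = excess(x_bar)
            if all(gi <= 0 for gi in gamma) and f(x_bar) < z_hat:
                x_hat, z_hat = x_bar, f(x_bar)      # new incumbent
            if L >= z_hat - tol:                    # bound test; it also covers the
                fathomed = True                     # complementary slackness case
                break
            norm2 = sum(gi * gi for gi in gamma)    # positive here, else L = f(x_bar)
            theta = (z_hat - L) / norm2 if math.isfinite(z_hat) else 1.0 / (k + 1)
            u = [max(0.0, ui + theta * gi) for ui, gi in zip(u, gamma)]
        if not fathomed and len(prefix) < n:        # separate on the next variable
            subproblems.append(prefix + (0,))
            subproblems.append(prefix + (1,))
    return x_hat, z_hat

print(branch_and_bound((-5, -4, -3), ((4, 3, 2),), (6,)))
```

An unfathomed subproblem with all variables fixed is necessarily infeasible
and is simply dropped. A practical code would replace the enumeration of
$X^k$ by the special-structure Lagrangean algorithms described in step 4 and
would choose the branching variable from dual information rather than by
position.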

There are some constructs used in branch and bound derived from or related
to Lagrangean techniques which we will not cover in any detail. One such
construct is the calculation of penalties relative to a given LP relaxation of
a discrete optimization problem (see Dakin (1965), Driebeek (1966),
Healy (1964), Tomlin (1971)). A penalty for a zero-one IP problem, for example,
is a lower bound estimate on the increase in cost of the primal objective
function value as the result of separating the IP problem by fixing a specific
variable at zero and one. Another construct is the surrogate constraint which
is given in the form

$$f(x) + u(g(x) - b) < \hat z$$

for any $u \ge 0$ (see Geoffrion (1969) or Glover (196S)). The idea is that this
constraint can be added to (30) since any feasible solution with lower cost
than $\hat z$ will satisfy it. The constraint has a strong effect on the
analysis of subproblems derived from (30) if u is chosen to be optimal or near
optimal in the dual (31). Geoffrion (1974) discusses in greater detail
penalties and surrogate constraints from the Lagrangean point of view.

6. Future research and applications areas

We have seen that Lagrangean techniques have already been widely used
to analyze discrete optimization problems. Nevertheless, further progress
should be possible in the use of these techniques, particularly in their
integration with branch and bound, and the construction of fast hybrid
algorithms for solving dual problems. We saw in section five that a family
of related dual problems is generated and used in conjunction with branch
and bound. The relationship between these duals is incompletely understood,
as are methods for exploiting the relationship in their optimization. Some
work in this direction has been done by Marsten and Morin (1976). They
give a new way to use linear programming to compute bounds on LP relaxations
in branch and bound. Specifically, a resource-space tour is defined such
that each simplex pivot yields a bound for every unfathomed subproblem in
the branch and bound search.

Sensitivity and parametric analysis of IP problems is an area of current
research interest and considerable practical importance in which Lagrangean
techniques can play a significant role. Geoffrion and Nauss (1977) give an
overview of the work done thus far in this area. Shapiro (1976) discusses
how the constructs from section 4 can be used in sensitivity analysis.

Multicriterion IP is a particularly desirable type of parametric analysis
which has not yet been implemented. The idea would be to use the branch and
bound search to generate a number of feasible IP solutions which are optimal
or near optimal under various objective functions. The work required to find
a number of interesting mixed IP solutions may be little more than that of
finding a single optimal solution. Parametric variation of the right hand
side is studied by Marsten and Morin (1975).

Another recent area of considerable research interest in which Lagrangean
techniques are applicable is in the analysis of heuristic methods for
combinatorial optimization. Cornuejols, Fisher and Nemhauser (1977) develop a
"greedy" heuristic to generate feasible solutions to a class of location
problems and use Lagrangean techniques to assess the error in objective
function optimality.
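
How a Lagrangean bound can measure the error of a greedy solution may be
sketched as follows. The formulation is an assumption made for illustration
only, since the text above says no more than "a class of location problems":
choose k of n locations to maximize the sum over clients i of the best value
c[i][j] among the chosen locations, with all c[i][j] non-negative. Dualizing
the requirement that each client is assigned exactly once, with multipliers
u, gives an upper bound for any u. The function names, the data and the
choice of u are hypothetical.

```python
def greedy(c, k):
    """Open k locations one at a time, each time taking the largest gain.

    c[i][j] is the (non-negative) value client i receives from location j.
    Returns the opened locations and the value each client receives.
    """
    m, n = len(c), len(c[0])
    opened, best = [], [0] * m

    for _ in range(k):
        def gain(j):
            # Total improvement over the current assignment if j is opened.
            return sum(max(0, c[i][j] - best[i]) for i in range(m))

        candidates = [j for j in range(n) if j not in opened]
        j_star = max(candidates, key=gain)
        opened.append(j_star)
        best = [max(best[i], c[i][j_star]) for i in range(m)]
    return opened, best

def lagrangean_upper_bound(c, k, u):
    """Upper bound sum(u) + (sum of the k largest rho_j), where
    rho_j = sum_i max(0, c[i][j] - u[i]). Valid for any multipliers u."""
    m, n = len(c), len(c[0])
    rho = sorted((sum(max(0, c[i][j] - u[i]) for i in range(m))
                  for j in range(n)), reverse=True)
    return sum(u) + sum(rho[:k])

# Illustrative data (an assumption): 4 clients, 3 candidate locations.
c = [[6, 2, 1],
     [5, 1, 3],
     [1, 7, 2],
     [2, 6, 8]]
k = 2

opened, best = greedy(c, k)
greedy_value = sum(best)

# One convenient choice of multipliers: the values the greedy solution
# delivers to each client. Other choices may give a tighter bound.
bound = lagrangean_upper_bound(c, k, best)

print(opened, greedy_value, bound)   # [1, 0] 24 26
```

Since the optimal value lies between greedy_value and bound, the greedy
solution here is within 26 - 24 = 2 of optimal, without the optimum having
been computed.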